Bayesian decision tree state tying for conversational speech recognition

نویسندگان

  • Rusheng Hu
  • Yunxin Zhao
چکیده

This paper presents a new method of constructing phonetic decision trees (PDTs) for acoustic model state tying based on implicitly induced prior knowledge. Our hypothesis is that knowledge on pronunciation variation in spontaneous, conversational speech contained in a relatively large corpus can be used for building domain-specific or speaker-dependent PDTs. In the view of tree structure adaptation, this method leads to transformation of tree topology in contrast to keeping fixed tree structure as in traditional methods of speaker adaptation. A Bayesian learning framework is proposed to incorporate prior knowledge on decision rules in a greedy search of new decision trees, where the prior is generated by a decision tree growing process on a large data set. Experimental results on the Telemedicine automatic captioning task demonstrate that the proposed approach results in consistent improvement in model quality and recognition accuracy.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Decision tree state tying based on penalized Bayesian information criterion

In this paper, an approach of penalized Bayesian information criterion (pBIC) for decision tree state tying is described. The pBIC is applied to two important applications. First, it is used as a decision tree growing criterion in place of the conventional approach of using a heuristic constant threshold. It is found that original BIC penalty is too low and will not lead to compact decision tre...

متن کامل

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

متن کامل

Enhanced tree clustering with single pronunciation dictionary for conversational speech recognition

Modeling pronunciation variation is key for recognizing conversational speech. Rather than being limited to dictionary modeling, we argue that triphone clustering is an integral part of pronunciation modeling. We propose a new approach called enhanced tree clustering. This approach, in contrast to traditional decision tree based state tying, allows parameter sharing across phonemes. We show tha...

متن کامل

Robust decision tree state tying for continuous speech recognition

In this paper, methods of improving the robustness and accuracy of acoustic modeling using decision tree based state tying are described. A new two-level segmental clustering approach is devised which combines the decision tree based state tying with agglomerative clustering of rare acoustic phonetic events. In addition, a unified maximum likelihood framework for incorporating both phonetic and...

متن کامل

Bayesian context clustering using cross valid prior distribution for HMM-based speech recognition

Decision tree based context clustering [Young; '94] ・ Construct a parameter tying structure ・ Can estimate robust parameter ・ Can generate unseen context dependent models ・ Minimum description length (MDL) criterion [Shinoda; '97] Bayesian approach ・ Variational Bayesian (VB) method [Attias; '99] ⇒ Applied to speech recognition [Watanabe; '04] ・ Can use prior information ⇒ Affect context cluste...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006